Network of two-Chinese-character compound words in Japanese language

Authors

  • Ken Yamamoto
  • Yoshihiro Yamazaki
Abstract

Some statistical properties of a network of two-Chinese-character compound words in the Japanese language are reported. In this network, a node represents a Chinese character and an edge represents a two-Chinese-character compound word. The network is found to have both the "small-world" and the "scale-free" property. A network formed only by the Chinese characters for common use (joyo-kanji in Japanese), which is regarded as a subnetwork of the original network, also has the small-world property. However, the degree distribution of this subnetwork exhibits no clear power law. To reproduce the disappearance of the power-law property, a model of the process by which the Chinese characters for common use are selected is proposed.
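The construction described in the abstract (characters as nodes, two-character compound words as edges) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes an undirected simple graph, so 日本 and 本日 map to the same edge, and it skips words that repeat one character; the abstract does not say how the paper handles either case. The function names and the toy word list are hypothetical. The three quantities computed are the ones usually inspected for the properties named above: the degree distribution (scale-free), and the average clustering coefficient together with the average shortest-path length (small-world).

```python
from collections import defaultdict, deque


def build_network(words):
    """Build an undirected adjacency map: character -> set of neighbouring characters.

    Assumption: only two-character words with two distinct characters are used.
    """
    adj = defaultdict(set)
    for word in words:
        if len(word) != 2 or word[0] == word[1]:
            continue  # skip anything that is not a two-character, two-node word
        a, b = word
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def degree_distribution(adj):
    """Return a map: degree k -> number of nodes with degree k."""
    counts = defaultdict(int)
    for neighbours in adj.values():
        counts[len(neighbours)] += 1
    return dict(counts)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        nb = list(neighbours)
        # count edges that exist among the neighbours of this node
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nb[j] in adj[nb[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Toy word list for illustration only; it is not the data set used in the paper.
toy_words = ["日本", "本日", "本人", "人口", "入口", "日光", "大人", "大学", "学生", "人生"]
network = build_network(toy_words)
print(degree_distribution(network))
print(average_clustering(network))
print(average_path_length(network))
```

On a real dictionary, a degree distribution that is close to a straight line on a log-log plot would suggest a power law, while a clustering coefficient much larger than that of a random graph of the same size, combined with a short average path length, would suggest the small-world property.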

Similar resources

Structure and modeling of the network of two-Chinese-character compound words in the Japanese language

This paper proposes a numerical model of the network of two-Chinese-character compound words (two-character network, for short). In this network, a Chinese character is a node and a two-Chinese-character compound word links two nodes. The basic framework of the model is that an important character gets many edges. As the importance of a character, we use the frequency of each character ...

Normal and impaired reading of Japanese kanji and kana

Two kinds of scripts are used in the written forms of Japanese words: morphographic kanji and phonographic kana. Whereas each kana character invariably represents a single pronunciation, the majority of kanji characters have two or more legitimate pronunciations, with one appropriate to the character in any given word. Furthermore, each kanji character has meaning while a kana character does no...

Character Decomposition and Transposition Processes in Chinese Compound Words Modulates Attentional Blink

The attentional blink (AB) is the phenomenon in which the identification of the second of two targets (T2) is attenuated if it is presented less than 500 ms after the first target (T1). Although the AB is eliminated in canonical word conditions, it remains unclear whether the character order in compound words affects the magnitude of the AB. Morpheme decomposition and transposition of Chinese t...

ACBiMA: Advanced Chinese Bi-Character Word Morphological Analyzer

While morphological information has been demonstrated to be useful for various Chinese NLP tasks, there is still a lack of complete theories, category schemes, and toolkits for Chinese morphology. This paper focuses on the morphological structures of Chinese bi-character words, where a corpus were collected based on a well-defined morphological type scheme covering both Chinese derived words and...

A Semantic Approach to Kanji Lexicography

The Japanese script consists of two phonetic syllabaries, called hiragana (e.g. か /ka/) and katakana (e.g. カ /ka/), and thousands of Chinese characters, called kanji (e.g. 本 hon). Chinese characters have three basic properties: form, sound, and meaning. Many characters are of complex shape, some having more than twenty or even thirty strokes. Each character may be pronounced according to its Chines...

Journal title:
  • CoRR

Volume abs/0902.4060  Issue 

Pages  -

Publication date 2009